Abstract 

In this note, we show that small complex perturbations of positive matrices are 
contractions, with respect to a complex version of the Hilbert metric, on the standard 
complex simplex. We show that this metric can be used to obtain estimates of the 
domain of analyticity of entropy rate for a hidden Markov process when the underlying 
Markov chain has strictly positive transition probabilities. 

The purpose of this note is twofold. First, in Section 1 we introduce a complex version
of the Hilbert metric on the standard real simplex. This metric is defined on a complex
neighbourhood of the interior of the standard real simplex, within the standard complex
simplex. We show that if the neighbourhood is sufficiently small, then any sufficiently
small complex perturbation of a strictly positive square matrix acts as a contraction with
respect to this metric. While this paper was nearing completion, we were informed of a
different complex Hilbert metric, which was recently introduced in [2]. We briefly discuss the
relation between that metric and our metric in Remark 1.6.

Secondly, we show how one can use a complex Hilbert metric to obtain lower estimates of 
the domain of analyticity of entropy rate for a hidden Markov process when the underlying 
Markov chain has strictly positive transition probabilities. The domain of analyticity is 
important because it specifies an explicit region where a Taylor series converges to the 
entropy rate and also gives an explicit estimate on the rate of convergence of the Taylor 
approximation. 

In principle, an estimate on the domain can be obtained by examining the proof of
analyticity in [5]. That proof was based on a contraction mapping argument, using the fact
that the real Euclidean metric is equivalent to the real Hilbert metric. In Section 2.1 we
revisit certain aspects of the proof and outline how to modify the proof using a complex
Hilbert metric; this yields a more direct estimate. In Section 2.2 we illustrate this with a
small example, using our Hilbert metric.

We remark that the entropy rate of a hidden Markov process can be interpreted as a top 
Lyapunov exponent for a random matrix product [1]. In principle, a complex Hilbert metric
can be used, more generally, to estimate the domain of analyticity of the top Lyapunov 
exponent for certain random matrix products; see [8], [9]. 



1 Complex Hilbert Metric 

We begin with a review of the real Hilbert metric. Let $B$ be a positive integer, and let $W$
be the standard simplex in $B$-dimensional real Euclidean space:

$$W = \{w = (w_1, w_2, \ldots, w_B) \in \mathbb{R}^B : w_i \geq 0, \ \sum_i w_i = 1\},$$

and let $W^\circ$ denote its interior, consisting of the vectors with positive coordinates. For any
two vectors $v, w \in W^\circ$, the Hilbert metric [12] is defined as

$$d_H(w, v) = \max_{i,j} \log\left(\frac{w_i/w_j}{v_i/v_j}\right). \qquad (1)$$

For a $B \times B$ strictly positive matrix $T = (t_{ij})$, the mapping $f_T$ induced by $T$ on $W$ is
defined by $f_T(w) = wT/(wT\mathbf{1})$, where $\mathbf{1}$ is the all-ones vector. It is well known that $f_T$ is a
contraction mapping under the Hilbert metric [12]. The contraction coefficient of $T$, which
is also called the Birkhoff coefficient, is given by

$$\tau(T) = \sup_{v \neq w} \frac{d_H(vT, wT)}{d_H(v, w)} = \frac{1 - \sqrt{\phi(T)}}{1 + \sqrt{\phi(T)}}, \qquad (2)$$

where $\phi(T) = \min_{i,j,k,l} \dfrac{t_{ik} t_{jl}}{t_{jk} t_{il}}$. This result extends to the case where $T$ has all columns strictly
positive or all zero and at least one strictly positive column (then, in the definition of $\phi(T)$,
consider only $k, l$ corresponding to strictly positive columns).
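The definitions (1) and (2) are easy to experiment with numerically. The following is a minimal sketch, assuming an arbitrary illustrative $3 \times 3$ matrix (not taken from the paper) and hypothetical helper names; it compares the Birkhoff coefficient with the worst contraction ratio seen over random pairs of points.

```python
import math
import random

def hilbert_metric(w, v):
    """Real Hilbert metric (1): max_{i,j} log((w_i/w_j)/(v_i/v_j))."""
    logs = [math.log(wi / vi) for wi, vi in zip(w, v)]
    return max(logs) - min(logs)

def induced_map(T, w):
    """f_T(w) = wT / (wT1) for a row vector w."""
    n = len(T)
    wT = [sum(w[m] * T[m][i] for m in range(n)) for i in range(n)]
    s = sum(wT)
    return [x / s for x in wT]

def birkhoff_coefficient(T):
    """tau(T) from (2), with phi(T) = min t_ik t_jl / (t_jk t_il)."""
    n = len(T)
    phi = min(T[i][k] * T[j][l] / (T[j][k] * T[i][l])
              for i in range(n) for j in range(n)
              for k in range(n) for l in range(n))
    return (1 - math.sqrt(phi)) / (1 + math.sqrt(phi))

def random_simplex_point(rng, n):
    raw = [rng.uniform(0.05, 1.0) for _ in range(n)]
    s = sum(raw)
    return [x / s for x in raw]

# Illustrative strictly positive matrix (an assumption, chosen arbitrarily).
T = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.1, 0.4, 0.5]]
rng = random.Random(0)
tau = birkhoff_coefficient(T)
worst = 0.0
for _ in range(2000):
    v, w = random_simplex_point(rng, 3), random_simplex_point(rng, 3)
    ratio = hilbert_metric(induced_map(T, v), induced_map(T, w)) / hilbert_metric(v, w)
    worst = max(worst, ratio)
print("tau(T) =", tau, " worst sampled ratio =", worst)
```

The sampled ratios only give a lower estimate of the supremum in (2), so they should never exceed $\tau(T)$.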

Let $W_{\mathbb{C}}$ denote the complex version of $W$, i.e., $W_{\mathbb{C}}$ denotes the complex simplex comprising
the vectors

$$\{w = (w_1, w_2, \ldots, w_B) \in \mathbb{C}^B : \sum_i w_i = 1\}.$$

Let $W_{\mathbb{C}}^+ = \{v \in W_{\mathbb{C}} : \Re(v_i) > 0\}$. For $v, w \in W_{\mathbb{C}}^+$, let

$$d_H(v, w) = \max_{i,j} \left| \log\left(\frac{w_i/w_j}{v_i/v_j}\right) \right|, \qquad (3)$$

where log is taken as the principal branch of the complex $\log(\cdot)$ function (i.e., the branch
whose branch cut is the negative real axis). Since the principal branch of log is additive on
the right-half plane, $d_H$ is a metric on $W_{\mathbb{C}}^+$, which we call a complex Hilbert metric.
We begin with the following very simple lemma.

Lemma 1.1. Let $n \geq 2$. For any fixed $z_1, z_2, \ldots, z_n, z \in \mathbb{C}$ and fixed $t > 0$, we have

$$\sup_{t_1, \ldots, t_n \geq 0,\ t_1 + t_2 + \cdots + t_n = t} |t_1 z_1 + t_2 z_2 + \cdots + t_n z_n + z| = \max_{i = 1, \ldots, n} |t z_i + z|.$$

Proof. The convex hull of $z_1, z_2, \ldots, z_n$ is a solid polygon, taking the form

$$\{(t_1/t) z_1 + (t_2/t) z_2 + \cdots + (t_n/t) z_n : t_1, t_2, \ldots, t_n \geq 0,\ t_1 + t_2 + \cdots + t_n = t\}.$$

By convexity, the distance from any point in this solid polygon to the point $(-1/t)z$ will
achieve the maximum at one of the extreme points, namely

$$\sup_{t_1, \ldots, t_n \geq 0,\ t_1 + t_2 + \cdots + t_n = t} |(t_1/t) z_1 + (t_2/t) z_2 + \cdots + (t_n/t) z_n - (-1/t) z| = \max_{i = 1, \ldots, n} |z_i - (-1/t) z|.$$

The lemma then immediately follows.

□
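As a sanity check on Lemma 1.1, one can sample weight vectors and confirm that no sampled value exceeds the maximum over the vertices. This is only a sketch: the points, the value of $t$ and the function name are illustrative assumptions.

```python
import random

def lemma_1_1_check(zs, z, t, trials, rng):
    """Return (vertex maximum, largest sampled value) for the two sides of Lemma 1.1."""
    vertex_max = max(abs(t * zi + z) for zi in zs)
    best_sampled = 0.0
    for _ in range(trials):
        raw = [rng.random() for _ in zs]
        s = sum(raw)
        ts = [t * x / s for x in raw]  # nonnegative weights summing to t
        value = abs(sum(ti * zi for ti, zi in zip(ts, zs)) + z)
        best_sampled = max(best_sampled, value)
    return vertex_max, best_sampled

rng = random.Random(0)
zs = [complex(1, 2), complex(-0.5, 0.3), complex(0.2, -1.5), complex(2, 0)]
vertex_max, best_sampled = lemma_1_1_check(zs, complex(0.4, -0.7), 1.7, 5000, rng)
print(vertex_max, best_sampled)
```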



The following lemma is implied by the proof of Lemma 2.1 of [10] ; we give a proof for 
completeness. 

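Lemma 1.2. For fixed $a_1, a_2, \ldots, a_B > 0$ and fixed $x_1, x_2, \ldots, x_B > 0$ in $\mathbb{R}$, define:

$$D_n = \frac{a_n x_n}{\sum_{m=1}^{B} a_m x_m} - \frac{x_n}{\sum_{m=1}^{B} x_m}.$$

Let $\mathcal{T}_0 = \{n : D_n \geq 0\}$ and $\mathcal{T}_1 = \{n : D_n < 0\}$. Then we have

$$\sum_{n \in \mathcal{T}_0} D_n = \sum_{n \in \mathcal{T}_1} |D_n| \leq \frac{1 - \sqrt{a/A}}{1 + \sqrt{a/A}},$$

where $a = \min\{a_1, a_2, \ldots, a_B\}$ and $A = \max\{a_1, a_2, \ldots, a_B\}$.

Proof. It immediately follows from $\sum_{n=1}^{B} D_n = 0$ and the definitions of $\mathcal{T}_0$ and $\mathcal{T}_1$ that

$$\sum_{n \in \mathcal{T}_0} D_n = \sum_{n \in \mathcal{T}_1} |D_n|.$$

Now

$$\sum_{n \in \mathcal{T}_0} D_n = \frac{\sum_{n \in \mathcal{T}_0} a_n x_n}{\sum_{m \in \mathcal{T}_0} a_m x_m + \sum_{m \in \mathcal{T}_1} a_m x_m} - \frac{\sum_{n \in \mathcal{T}_0} x_n}{\sum_{m \in \mathcal{T}_0} x_m + \sum_{m \in \mathcal{T}_1} x_m} \leq \frac{A \sum_{n \in \mathcal{T}_0} x_n}{A \sum_{m \in \mathcal{T}_0} x_m + a \sum_{m \in \mathcal{T}_1} x_m} - \frac{\sum_{n \in \mathcal{T}_0} x_n}{\sum_{m \in \mathcal{T}_0} x_m + \sum_{m \in \mathcal{T}_1} x_m}.$$

Let

$$z = \frac{\sum_{m \in \mathcal{T}_1} x_m}{\sum_{m \in \mathcal{T}_0} x_m};$$

we then have

$$\sum_{n \in \mathcal{T}_0} D_n \leq \frac{1}{1 + (a/A) z} - \frac{1}{1 + z} = f(z).$$

Simple calculus shows that $f(z)$ will be bounded above by $\dfrac{1 - \sqrt{a/A}}{1 + \sqrt{a/A}}$ on $[0, \infty)$ (the maximum is attained at $z = \sqrt{A/a}$). This establishes
the lemma. □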
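Lemma 1.2 can also be probed numerically. The sketch below draws random positive $a_n$ and $x_n$ (arbitrary ranges, chosen for illustration) and checks both the equality of the two sums and the upper bound; the function name is hypothetical.

```python
import math
import random

def lemma_1_2_sides(a, x):
    """Return (sum of D_n over T_0, sum of |D_n| over T_1, upper bound of Lemma 1.2)."""
    sa = sum(ai * xi for ai, xi in zip(a, x))
    sx = sum(x)
    D = [ai * xi / sa - xi / sx for ai, xi in zip(a, x)]
    positive_part = sum(d for d in D if d >= 0)
    negative_part = sum(-d for d in D if d < 0)
    s = math.sqrt(min(a) / max(a))
    return positive_part, negative_part, (1 - s) / (1 + s)

rng = random.Random(0)
results = []
for _ in range(2000):
    B = rng.randint(2, 6)
    a = [rng.uniform(0.1, 5.0) for _ in range(B)]
    x = [rng.uniform(0.1, 5.0) for _ in range(B)]
    results.append(lemma_1_2_sides(a, x))
print(max(pos / bound for pos, _, bound in results))
```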
Let $W_{\mathbb{C}}^\circ(\delta)$ denote the "relative" $\delta$-neighborhood of $W^\circ$ in $W_{\mathbb{C}}$, i.e.,

$$W_{\mathbb{C}}^\circ(\delta) = \{v = (v_1, v_2, \ldots, v_B) \in W_{\mathbb{C}} : \exists u \in W^\circ,\ |v_i - u_i| \leq \delta |u_i|,\ i = 1, 2, \ldots, B\}.$$

Note that when $\delta < 1$, $W_{\mathbb{C}}^\circ(\delta) \subset W_{\mathbb{C}}^+$ and so the complex Hilbert metric is defined on $W_{\mathbb{C}}^\circ(\delta)$.

We consider complex matrices $\hat{T} = (\hat{t}_{ij})$ which are perturbations of a strictly positive
matrix $T = (t_{ij})$. For such a matrix $T$ and $r > 0$, let $B_T(r)$ denote the set of all complex
matrices $\hat{T}$ such that for all $i, j$,

$$|t_{ij} - \hat{t}_{ij}| \leq r.$$

With the aid of the above lemmas, we shall prove:

Theorem 1.3. Let $T$ be a strictly positive matrix. There exist $r, \delta > 0$ such that whenever
$\hat{T} \in B_T(r)$, $f_{\hat{T}}$ is a contraction mapping on $W_{\mathbb{C}}^\circ(\delta)$ under the complex Hilbert metric.

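Proof. For $\hat{x}, \hat{y} \in W_{\mathbb{C}}^\circ(\delta)$ with $\hat{x} \neq \hat{y}$, and for indices $i, j$, let

$$\hat{L}_{ij} = \frac{\log\big((\hat{x}\hat{T})_i/(\hat{x}\hat{T})_j\big) - \log\big((\hat{y}\hat{T})_i/(\hat{y}\hat{T})_j\big)}{\max_{k,l} |\log(\hat{x}_k/\hat{y}_k) - \log(\hat{x}_l/\hat{y}_l)|}.$$

Note that

$$\frac{d_H(\hat{x}\hat{T}, \hat{y}\hat{T})}{d_H(\hat{x}, \hat{y})} = \max_{i,j} |\hat{L}_{ij}|.$$

It suffices to prove that there exists $0 < \rho < 1$ such that for sufficiently small $r, \delta > 0$, all
$\hat{x}, \hat{y} \in W_{\mathbb{C}}^\circ(\delta)$ with $\hat{x} \neq \hat{y}$, all $\hat{T} \in B_T(r)$, and any $i, j$,

$$|\hat{L}_{ij}| < \rho.$$

For each $m$, let $\hat{c}_m = \log(\hat{x}_m/\hat{y}_m)$; then $\hat{x}_m = \hat{y}_m e^{\hat{c}_m}$. Choose $p \neq q$ such that

$$|\hat{c}_p - \hat{c}_q| = \max_{k,l} |\hat{c}_k - \hat{c}_l|.$$

Hence:

$$\hat{L}_{ij} = \frac{\log\left(\sum_m \hat{y}_m e^{\hat{c}_m - \hat{c}_q} \hat{T}_{mi} \big/ \sum_m \hat{y}_m e^{\hat{c}_m - \hat{c}_q} \hat{T}_{mj}\right) - \log\left(\sum_m \hat{y}_m \hat{T}_{mi} \big/ \sum_m \hat{y}_m \hat{T}_{mj}\right)}{|\hat{c}_p - \hat{c}_q|}.$$

Define

$$F(t) = \log\left(\sum_m \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)t} \hat{T}_{mi} \Big/ \sum_m \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)t} \hat{T}_{mj}\right).$$

Since

$$|F(1) - F(0)| = \left|\int_0^1 F'(t)\,dt\right| \leq \max_{\xi \in [0,1]} |F'(\xi)|,$$

we have

$$|\hat{L}_{ij}| = \frac{|F(1) - F(0)|}{|\hat{c}_p - \hat{c}_q|} \leq \frac{\max_{\xi \in [0,1]} |F'(\xi)|}{|\hat{c}_p - \hat{c}_q|}. \qquad (4)$$

Note that $F'(\xi)$ takes the following form:

$$F'(\xi) = \frac{\sum_m (\hat{c}_m - \hat{c}_q) \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)\xi} \hat{T}_{mi}}{\sum_m \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)\xi} \hat{T}_{mi}} - \frac{\sum_m (\hat{c}_m - \hat{c}_q) \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)\xi} \hat{T}_{mj}}{\sum_m \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)\xi} \hat{T}_{mj}}.$$

Now for all $m$ let $\hat{a}_m = \hat{T}_{mi}/\hat{T}_{mj}$. Then

$$\frac{F'(\xi)}{|\hat{c}_p - \hat{c}_q|} = \sum_n \frac{\hat{c}_n - \hat{c}_q}{|\hat{c}_p - \hat{c}_q|} \left( \frac{\hat{y}_n e^{(\hat{c}_n - \hat{c}_q)\xi} \hat{a}_n \hat{T}_{nj}}{\sum_m \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)\xi} \hat{a}_m \hat{T}_{mj}} - \frac{\hat{y}_n e^{(\hat{c}_n - \hat{c}_q)\xi} \hat{T}_{nj}}{\sum_m \hat{y}_m e^{(\hat{c}_m - \hat{c}_q)\xi} \hat{T}_{mj}} \right) = \sum_n \frac{\hat{c}_n - \hat{c}_q}{|\hat{c}_p - \hat{c}_q|} \hat{D}_n, \qquad (5)$$

where $\hat{D}_n$ denotes the quantity in parentheses in the middle expression above.

Let $x, y \in W^\circ$ be such that for all $k$, $|\hat{x}_k - x_k| \leq \delta |x_k|$ and $|\hat{y}_k - y_k| \leq \delta |y_k|$. Let $a_m = T_{mi}/T_{mj}$,
$c_m = \log(x_m/y_m)$, and let $D_n$ denote the unperturbed version of $\hat{D}_n$:

$$D_n = \frac{y_n e^{(c_n - c_q)\xi} a_n T_{nj}}{\sum_m y_m e^{(c_m - c_q)\xi} a_m T_{mj}} - \frac{y_n e^{(c_n - c_q)\xi} T_{nj}}{\sum_m y_m e^{(c_m - c_q)\xi} T_{mj}}. \qquad (6)$$

By Lemma 1.2 we have

$$\sum_{n \in \mathcal{T}_0} D_n = \sum_{n \in \mathcal{T}_1} |D_n| \leq \max_{k,l} \frac{1 - \sqrt{a_k/a_l}}{1 + \sqrt{a_k/a_l}} < 1, \qquad (7)$$

where $\mathcal{T}_0 = \{n : D_n \geq 0\}$ and $\mathcal{T}_1 = \{n : D_n < 0\}$.
Now, for some universal constant $K_0$,

$$\left| \sum_n \frac{\hat{c}_n - \hat{c}_q}{|\hat{c}_p - \hat{c}_q|} \hat{D}_n - \sum_n \frac{\hat{c}_n - \hat{c}_q}{|\hat{c}_p - \hat{c}_q|} D_n \right| \leq K_0(\delta + r). \qquad (8)$$

Applying Lemma 1.1 twice, we conclude that there exist $n_0 \in \mathcal{T}_0$, $n_1 \in \mathcal{T}_1$ such that

$$\left| \sum_n \frac{\hat{c}_n - \hat{c}_q}{|\hat{c}_p - \hat{c}_q|} D_n \right| \leq \left| \sum_{n \in \mathcal{T}_1} \frac{\hat{c}_n - \hat{c}_{n_0}}{|\hat{c}_p - \hat{c}_q|} D_n \right| \leq \frac{|\hat{c}_{n_1} - \hat{c}_{n_0}|}{|\hat{c}_p - \hat{c}_q|} \sum_{n \in \mathcal{T}_0} D_n.$$

Then together with (4), (5), (8), (7), and the fact that $|\hat{c}_{n_1} - \hat{c}_{n_0}| \leq |\hat{c}_p - \hat{c}_q|$, we obtain that
for sufficiently small $r, \delta > 0$, $|\hat{L}_{ij}|$ is upper bounded by some $\rho < 1$, as desired. □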
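Theorem 1.3 can be illustrated by a small numerical experiment. The following is a sketch only: the matrix, the values of $r$ and $\delta$, the sampling scheme and all helper names are assumptions made for illustration, and random sampling gives evidence of contraction rather than a proof.

```python
import cmath
import random

def complex_hilbert_metric(v, w):
    """d_H(v, w) from (3), using the principal branch of log."""
    n = len(v)
    return max(abs(cmath.log((w[i] / w[j]) / (v[i] / v[j])))
               for i in range(n) for j in range(n))

def induced_map(T, w):
    """f_T(w) = wT / (wT1) for a (possibly complex) row vector w."""
    n = len(T)
    wT = [sum(w[m] * T[m][i] for m in range(n)) for i in range(n)]
    s = sum(wT)
    return [x / s for x in wT]

def small_complex(rng, radius):
    """A complex number of modulus at most `radius`."""
    return radius * complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) / 2 ** 0.5

def sample_relative_neighbourhood(rng, n, delta):
    """A point v with sum(v) = 1 and |v_i - u_i| <= delta * u_i for some u in W°."""
    raw = [rng.uniform(0.1, 1.0) for _ in range(n)]
    total = sum(raw)
    u = [x / total for x in raw]
    eta = [small_complex(rng, delta / 2) for _ in range(n)]
    mean = sum(ui * ei for ui, ei in zip(u, eta))
    # Subtracting the u-weighted mean keeps the coordinates summing to 1,
    # and |eta_i - mean| <= delta.
    return [ui * (1 + ei - mean) for ui, ei in zip(u, eta)]

rng = random.Random(1)
T = [[1.0, 2.0, 1.5], [1.2, 1.0, 1.8], [2.0, 1.3, 1.0]]  # illustrative choice
r, delta = 1e-3, 1e-2
worst_ratio = 0.0
for _ in range(500):
    T_hat = [[t + small_complex(rng, r) for t in row] for row in T]
    x = sample_relative_neighbourhood(rng, 3, delta)
    y = sample_relative_neighbourhood(rng, 3, delta)
    ratio = (complex_hilbert_metric(induced_map(T_hat, x), induced_map(T_hat, y))
             / complex_hilbert_metric(x, y))
    worst_ratio = max(worst_ratio, ratio)
print("worst sampled contraction ratio:", worst_ratio)
```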

Remark 1.4. One can further choose $r, \delta > 0$ such that when $\hat{T} \in B_T(r)$, $f_{\hat{T}}(W_{\mathbb{C}}^\circ(\delta)) \subset W_{\mathbb{C}}^\circ(\delta)$.
Consider a compact subset $N \subset W^\circ$ such that $f_T(W) \subset N$. Let $N(R)$ denote the
Euclidean $R$-neighborhood of $N$ in $W_{\mathbb{C}}$. The proof of Theorem 1.3 implies that when $T > 0$
or ($T \geq 0$ and $\sup_{x, y \in N,\ 0 \leq \xi \leq 1} \sum_{n \in \mathcal{T}_0} D_n < 1$ (here $D_n$ is defined in (6))), there exist $r, R > 0$
such that when $\hat{T} \in B_T(r)$, $f_{\hat{T}}$ is a contraction mapping on $N(R)$ under the complex Hilbert
metric.

Example 1.5. Consider a $2 \times 2$ strictly positive matrix

$$T = \begin{bmatrix} a & c \\ b & d \end{bmatrix}.$$

If we parameterize the interior of the simplex $W^\circ$ by $(0, \infty)$: $w = (x, y) \mapsto x/y$, then letting
$z = x/y$, we have $f_T(z) = \dfrac{az + b}{cz + d}$; the domain of this mapping naturally extends from $(0, \infty)$
to the open right half complex plane $H$, and the complex Hilbert metric becomes simply
$d_H(z_1, z_2) = |\log(z_1/z_2)|$.

One can show that $f_T$ is a contraction on all of $H$ with contraction coefficient

$$\tau(T) = \frac{1 - \frac{bc}{ad}}{1 + \frac{bc}{ad}}$$

(assuming $\det(T) > 0$; otherwise, the last expression is replaced by $\dfrac{1 - \frac{ad}{bc}}{1 + \frac{ad}{bc}}$). To see this, for
any $z, w \in H$, consider

$$L = \frac{\log(f_T(z)) - \log(f_T(w))}{\log(z) - \log(w)}.$$

With the change of variables $u = \log(z)$, $v = \log(w)$, we have

$$L = \frac{\log(f_T(e^u)) - \log(f_T(e^v))}{u - v} = \int_0^1 \frac{e^{v + t(u - v)} f_T'(e^{v + t(u - v)})}{f_T(e^{v + t(u - v)})}\,dt,$$

which implies that

$$|L| \leq \sup_{z \in H} \left| \frac{z f_T'(z)}{f_T(z)} \right|.$$

A simple computation shows that

$$\frac{z f_T'(z)}{f_T(z)} = \frac{ad - bc}{acz + (ad + bc) + bd/z}. \qquad (9)$$



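To see that the supremum is $\dfrac{1 - \frac{bc}{ad}}{1 + \frac{bc}{ad}}$, first note that since $ad - bc > 0$ and $a, b, c, d > 0$,
the absolute value of the quantity on the right-hand side of (9) is maximized by minimizing
$|acz + bd/z|$; since the only solutions to $acz + bd/z = 0$ are $z = \pm i\sqrt{bd/ac}$, one sees that
the supremum is obtained by substituting $z = \pm i\sqrt{bd/ac}$ into (9), and this shows that the
supremum is indeed $\dfrac{1 - \frac{bc}{ad}}{1 + \frac{bc}{ad}}$.

Note that this contraction coefficient on $H$ is strictly larger (i.e., worse) than the contraction
coefficient on $[0, \infty)$:

$$\frac{1 - \sqrt{\frac{bc}{ad}}}{1 + \sqrt{\frac{bc}{ad}}}.$$

When

$$\hat{T} = \begin{bmatrix} \hat{a} & \hat{c} \\ \hat{b} & \hat{d} \end{bmatrix}$$

is a sufficiently small complex perturbation of $T$, then $f_{\hat{T}}(H) \subset H$ and one obtains

$$\tau(\hat{T}) = \sup_{z \in H} \left| \frac{z f_{\hat{T}}'(z)}{f_{\hat{T}}(z)} \right| = \sup_{z \in H} \left| \frac{\hat{a}\hat{d} - \hat{b}\hat{c}}{\hat{a}\hat{c}z + (\hat{a}\hat{d} + \hat{b}\hat{c}) + \hat{b}\hat{d}/z} \right|,$$

which will approximate $\dfrac{1 - \frac{bc}{ad}}{1 + \frac{bc}{ad}}$, and so $f_{\hat{T}}$ will still be a contraction on $H$.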
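The supremum computed in Example 1.5 can be checked on a grid. The sketch below uses an arbitrary positive matrix with $\det(T) > 0$ (an illustrative assumption) and evaluates the right-hand side of (9) over part of the right half-plane and near the critical point $z = i\sqrt{bd/ac}$ on the imaginary axis.

```python
import math

def local_ratio(a, b, c, d, z):
    """|z f_T'(z) / f_T(z)| for f_T(z) = (a z + b) / (c z + d), as in (9)."""
    return abs((a * d - b * c) / (a * c * z + (a * d + b * c) + b * d / z))

# Illustrative entries of T = [[a, c], [b, d]] with ad - bc > 0.
a, b, c, d = 2.0, 1.0, 1.0, 3.0
q = b * c / (a * d)
coeff_on_H = (1 - q) / (1 + q)                           # coefficient on H
coeff_on_reals = (1 - math.sqrt(q)) / (1 + math.sqrt(q))  # coefficient on [0, oo)

# Largest value of (9) over a grid of points with positive real part.
grid_max = max(local_ratio(a, b, c, d, complex(0.05 * i, 0.05 * j))
               for i in range(1, 80) for j in range(-80, 81))

# Value of (9) very close to the critical point on the imaginary axis.
near_critical = local_ratio(a, b, c, d, complex(1e-9, math.sqrt(b * d / (a * c))))
print(coeff_on_reals, grid_max, near_critical, coeff_on_H)
```

The grid values stay below the coefficient on $H$, while points approaching the imaginary axis near the critical point come arbitrarily close to it.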

Remark 1.6. While this paper was nearing completion, we were informed that alternative
complex Hilbert metrics, based on the Poincaré metric in the right-half complex plane,
were recently introduced in Rugh [11] and Dubois [2]. Contractiveness with respect to these
metrics is proven in great generality and yields far-reaching consequences for complex Perron-Frobenius
theory. The proofs of contractiveness in these papers seem rather different from
the calculus approach in our paper.

The complex Hilbert metric, which we call $d_P$, used in [2] (see equation (3.23)) is explicit
and natural, but slightly more complicated than our complex Hilbert metric; for $v, w \in W_{\mathbb{C}}^+$,

$$d_P(w, v) = \log \frac{\max_{i,j} \big(|w_i \bar{v}_j + w_j \bar{v}_i| + |w_i v_j - w_j v_i|\big)\big(2\Re(w_i \bar{w}_j)\big)^{-1}}{\min_{i,j} \big(|w_i \bar{v}_j + w_j \bar{v}_i| - |w_i v_j - w_j v_i|\big)\big(2\Re(w_i \bar{w}_j)\big)^{-1}}; \qquad (10)$$

here $\bar{z}$ denotes complex conjugate, $\Re(z)$ denotes real part, and log is the ordinary real
logarithm. In the 2-dimensional case, it can be verified that, if one transforms $w = (w_1, w_2)$
and $v = (v_1, v_2)$ to $z_1 = w_2/w_1$ and $z_2 = v_2/v_1$, then $d_P$ reduces to the Poincaré metric on
$H$:

$$d_P(z_1, z_2) = \log \frac{|z_1 + \bar{z}_2| + |z_1 - z_2|}{|z_1 + \bar{z}_2| - |z_1 - z_2|}.$$

Using the infinitesimal form for the Poincaré metric (as a Riemannian metric on $H$), one
checks that, in the $2 \times 2$ case, the Lipschitz constant for a complex matrix $\hat{T}$ such that
$f_{\hat{T}}(H) \subset H$ is:

$$\sup_{z \in H} \left| \frac{\Re(z) f_{\hat{T}}'(z)}{\Re(f_{\hat{T}}(z))} \right| \qquad (11)$$

in contrast to

$$\sup_{z \in H} \left| \frac{z f_{\hat{T}}'(z)}{f_{\hat{T}}(z)} \right| \qquad (12)$$

for our complex Hilbert metric (as in Example 1.5 above).

While we have not analyzed in detail the differences between these metrics, there are a
few things that can be said in the $2 \times 2$ case:

• $f_{\hat{T}}$ is a contraction with respect to $d_P$ on $H$ whenever it maps $H$ into its interior; this
follows from standard complex analysis (Section IX.3 of [3]), and Dubois [2] proves
an analog of this for the metric $d_P$ in (10) above in higher dimensions. However, this
does not hold for $d_H$.

• When $\hat{T} = T$ is strictly positive, the contraction coefficient with respect to $d_P$
is always at least as good as (i.e., at most) the contraction coefficient with respect to $d_H$.
This can be seen as follows.

First recall that any fractional linear transformation $T$ can be expressed as the composition
of translations, dilations and inversions. In the case where $T$ is strictly positive,
the translations are by positive real numbers and the dilations are by real numbers;
see page 65 of [3]. Using the infinitesimal forms (11) and (12), our assertion would follow
from:

$$\left| \frac{\Re(z) f_T'(z)}{\Re(f_T(z))} \right| \leq \left| \frac{z f_T'(z)}{f_T(z)} \right| \quad \text{for all } z \in H. \qquad (13)$$

This is true indeed: it is easy to see that in fact we get equality in (13) for inversions
and dilations by real numbers, and we get strict inequality in (13) for translations by
positive real numbers.

• When $\hat{T}$ is a complex perturbation of a strictly positive $T$, then (13) (with $T$ replaced
by $\hat{T}$) need not hold; in fact, for perturbations $\hat{T}$ of $T$ on the order of 1% and $z = x + yi \in H$,
with $|y|/x$ on the order of 1%, the contraction coefficient with respect to
$d_H$ may be slightly smaller than that with respect to $d_P$. The reason is that in this
case, the dilations may be complex (non-real) and for such a dilation the inequality
(13) may be reversed. Examples of this can be randomly generated in Matlab. For
example, if

$$\hat{T} = \begin{bmatrix} 0.012890500224 + 0.000128905002i & 0.310402226067 + 0.003104022260i \\ 0.779079247486 - 0.007790792474i & 0.307296084921 - 0.003072960849i \end{bmatrix}$$

and $z = 0.926678310631 - 0.009266783106i$, then the contraction coefficient of $d_H$ is
approximately 0.664396 and that of $d_P$ is approximately 0.664599. For larger perturbations,
the differences in contraction coefficient can be greater. The relative strength
of contraction of $d_H$, $d_P$ seems to be heavily dependent on specific choices of $\hat{T}$ and $z$.

• For any point $z$, other than 0, of the imaginary axis, the metric $d_H$ can be extended to
a neighbourhood, with respect to which any sufficiently small complex perturbation $\hat{T}$
of a strictly positive matrix acts as a contraction; on the other hand, there is no way
to do this with $d_P$ since it blows up as one approaches the imaginary axis.

• Also, if on a small punctured neighbourhood of 0 we replace $d_H$ by the metric $d(z_1, z_2) = |\log(z_1) - \log(z_2)|$,
then any sufficiently small complex perturbation $\hat{T}$ of a strictly positive matrix still
acts as a contraction.

In the next section, we use $d_H$ for estimates on the domain of analyticity of entropy
rate of a hidden Markov process. Alternatively, $d_P$ could be used; however, it appears to be
computationally easier to use $d_H$ for the estimation.
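The numerical example in Remark 1.6 can be re-evaluated with a few lines of Python. This sketch evaluates the integrands of (11) and (12) at the given point $z$. It assumes that the displayed matrix is read as the fractional linear map $z \mapsto (t_{11} z + t_{12})/(t_{21} z + t_{22})$; that reading is an assumption on our part, adopted because it yields values matching the quoted ones. The function names are hypothetical.

```python
def mobius(T, z):
    """Fractional linear map z -> (t11 z + t12) / (t21 z + t22) (assumed convention)."""
    return (T[0][0] * z + T[0][1]) / (T[1][0] * z + T[1][1])

def mobius_derivative(T, z):
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    return det / (T[1][0] * z + T[1][1]) ** 2

T_hat = [
    [complex(0.012890500224, 0.000128905002), complex(0.310402226067, 0.003104022260)],
    [complex(0.779079247486, -0.007790792474), complex(0.307296084921, -0.003072960849)],
]
z = complex(0.926678310631, -0.009266783106)

f = mobius(T_hat, z)
df = mobius_derivative(T_hat, z)

hilbert_rate = abs(z * df / f)             # integrand of (12), metric d_H
poincare_rate = abs(z.real * df / f.real)  # integrand of (11), metric d_P
print(hilbert_rate, poincare_rate)
```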



2 Domain of Analyticity of Entropy Rate of Hidden 
Markov Processes 

2.1 Background 

For $m, n \in \mathbb{Z}$ with $m \leq n$, we denote a sequence of symbols $y_m, y_{m+1}, \ldots, y_n$ by $y_m^n$. Consider
a stationary stochastic process $Y$ with a finite set of states $\mathcal{I} = \{1, 2, \ldots, B\}$ and distribution
$p(y_m^n)$. Denote the conditional distributions by $p(y_{n+1} \mid y_m^n)$. The entropy rate of $Y$ is defined
as

$$H(Y) = \lim_{n \to \infty} -E_p\big(\log(p(y_0 \mid y_{-n}^{-1}))\big),$$

where $E_p$ denotes expectation with respect to the distribution $p$.
Let $Y$ be a stationary first order Markov chain with

$$\Delta(i, j) = p(y_1 = j \mid y_0 = i).$$

In this section, we only consider the case when $\Delta$ is strictly positive.

A hidden Markov process (HMP) $Z$ is a process of the form $Z = \Phi(Y)$, where $\Phi$ is a
function defined on $\mathcal{I} = \{1, 2, \ldots, B\}$ with values in $\mathcal{A} = \{1, 2, \ldots, A\}$.

Recall that $W$ is the $B$-dimensional real simplex and $W_{\mathbb{C}}$ is the complex version of $W$.
For $a \in \mathcal{A}$, let $\mathcal{I}(a)$ denote the set of all indices $i \in \mathcal{I}$ with $\Phi(i) = a$. Let

$$W_a = \{w \in W : w_i = 0 \text{ whenever } i \notin \mathcal{I}(a)\}$$

and

$$W_{a,\mathbb{C}} = \{w \in W_{\mathbb{C}} : w_i = 0 \text{ whenever } i \notin \mathcal{I}(a)\}.$$

Let $\Delta_a$ denote the $B \times B$ matrix such that $\Delta_a(i, j) = \Delta(i, j)$ for $j \in \mathcal{I}(a)$, and $\Delta_a(i, j) = 0$
for $j \notin \mathcal{I}(a)$ (i.e., $\Delta_a$ is formed from $\Delta$ by "zeroing out" the columns corresponding to indices
that are not in $\mathcal{I}(a)$). For $a \in \mathcal{A}$, define the scalar-valued and vector-valued functions $r_a$
and $f_a$ on $W$ by

$$r_a(w) = w \Delta_a \mathbf{1},$$

and

$$f_a(w) = w \Delta_a / r_a(w).$$

Note that $f_a$ defines the action of the matrix $\Delta_a$ on the simplex $W$. For any fixed $n$ and $z_{-n}^0$
and for $i = -n, -n + 1, \ldots, 0$, define

$$x_i = x_i(z_{-n}^i) = p(y_i = \cdot \mid z_i, z_{i-1}, \ldots, z_{-n}), \qquad (14)$$

(here $\cdot$ represents the states of the Markov chain $Y$); then from Blackwell [1], we have that
$\{x_i\}$ satisfies the random dynamical iteration

$$x_{i+1} = f_{z_{i+1}}(x_i), \qquad (15)$$

starting with

$$x_{-n-1} = p(y_{-n-1} = \cdot), \qquad (16)$$

where $p(y_{-n-1} = \cdot)$ is the stationary distribution for the underlying Markov chain. One
checks that $p(z_{i+1} \mid z_{-n}^i)$ can be recovered from this dynamical system; more specifically, we
have

$$p(z_{i+1} \mid z_{-n}^i) = r_{z_{i+1}}(x_i).$$

If the entries of $\Delta = \Delta^\varepsilon$ are analytically parameterized by a real variable vector $\varepsilon \in \mathbb{R}^k$ ($k$
is a positive integer), then we obtain a family $Z = Z^\varepsilon$ and corresponding $\Delta_a = \Delta_a^\varepsilon$, $f_a = f_a^\varepsilon$,
etc.
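The iteration (15), (16) together with the identity $p(z_{i+1} \mid z_{-n}^i) = r_{z_{i+1}}(x_i)$ gives the probability of any observed sequence as a product of values of $r_a$. The sketch below illustrates this on a hypothetical 3-state chain with a 2-symbol output function; the matrix `DELTA`, the map `PHI` and the function names are illustrative assumptions, not taken from the paper.

```python
import itertools

# Hypothetical strictly positive transition matrix and output function Phi
# (0-based: states 0 and 1 emit symbol 0, state 2 emits symbol 1).
DELTA = [[0.6, 0.3, 0.1],
         [0.2, 0.5, 0.3],
         [0.3, 0.3, 0.4]]
PHI = [0, 0, 1]
ALPHABET = [0, 1]
B = len(DELTA)

def r(a, w):
    """r_a(w) = w Delta_a 1, where Delta_a keeps only the columns j with Phi(j) = a."""
    return sum(w[i] * DELTA[i][j] for i in range(B) for j in range(B) if PHI[j] == a)

def f(a, w):
    """f_a(w) = w Delta_a / r_a(w)."""
    row = [sum(w[i] * DELTA[i][j] for i in range(B)) if PHI[j] == a else 0.0
           for j in range(B)]
    s = sum(row)
    return [x / s for x in row]

def stationary(iterations=500):
    """Stationary vector of DELTA by power iteration."""
    pi = [1.0 / B] * B
    for _ in range(iterations):
        pi = [sum(pi[i] * DELTA[i][j] for i in range(B)) for j in range(B)]
    return pi

def sequence_probability(z):
    """p(z) as the product of r_{z_{i+1}}(x_i) along the iteration (15), started from (16)."""
    x = stationary()
    prob = 1.0
    for a in z:
        prob *= r(a, x)
        x = f(a, x)
    return prob

total = sum(sequence_probability(z) for z in itertools.product(ALPHABET, repeat=4))
print(total)
```

Summing over all output sequences of a fixed length should return 1, which is a quick consistency check on the definitions of $r_a$ and $f_a$.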

The following result was proven in [5].

Theorem 2.1. Suppose that the entries of $\Delta = \Delta^\varepsilon$ are analytically parameterized by a real
variable vector $\varepsilon$. If at $\varepsilon = \varepsilon_0$, $\Delta$ is strictly positive, then $H(Z) = H(Z^\varepsilon)$ is a real analytic
function of $\varepsilon$ at $\varepsilon_0$.

In [5] this result is stated in greater generality, allowing some entries of $\Delta$ to be zero.
The proof is based on an analysis of the action of perturbations of $f_a$ on neighbourhoods
of $W_b = f_b(W)$, with respect to the Euclidean metric. The proof assumes that each $f_a$
is a contraction on each $W_b$. While this need not hold, one can arrange for this to be
true by replacing the original system with a higher power system: namely, one replaces the
original alphabet $\mathcal{A}$ with $\mathcal{A}^n$ for some $n$ and replaces the mappings $\{f_a : a \in \mathcal{A}\}$ with
$\{f_{a_0} \circ f_{a_1} \circ \cdots \circ f_{a_{n-1}} : a_0 a_1 \cdots a_{n-1} \in \mathcal{A}^n\}$. The existence of such an $n$ follows from a) the
equivalence of the (real) Hilbert metric and the Euclidean metric on each $W_b$ (Proposition
2.1 of [5]) and b) the contractiveness of each $f_a$ with respect to the (real) Hilbert metric.
However, in the course of this replacement, one easily loses track of the domain of analyticity.

When at $\varepsilon = \varepsilon_0$, $\Delta$ is strictly positive, an alternative is to directly use a complex Hilbert
metric, as follows. For each $a \in \mathcal{A}$, we can define a complex Hilbert metric $d_{a,H}$ on $W_{a,\mathbb{C}}^{+}$ as
follows: for $w, v \in W_{a,\mathbb{C}}^{+}$:

$$d_{a,H}(w, v) = d_H(w_{\mathcal{I}(a)}, v_{\mathcal{I}(a)}) = \max_{i,j \in \mathcal{I}(a)} \left| \log\left(\frac{w_i/w_j}{v_i/v_j}\right) \right|. \qquad (17)$$

Theorem 1.3 implies that for each $a, b \in \mathcal{A}$, sufficiently small perturbations of $f_a$ are contractions
on sufficiently small complex neighborhoods of $W_b$ in $W_{b,\mathbb{C}}$; see Remark 1.4 (note
that while $\Delta_a$ is not strictly positive, $f_a$ maps into $W_a$ and so as a mapping from $W_b$ to $W_a$
it can be regarded as the induced mapping of a strictly positive matrix). For complex $\varepsilon$ close
to $\varepsilon_0$, $f_a = f_a^\varepsilon$ is sufficiently close to $f_a^{\varepsilon_0}$ to guarantee that $f_a^\varepsilon$ is a contraction.

Let $\Omega_{a,H}(R)$ denote the neighborhood of diameter $R$, measured in the complex Hilbert
metric, of $W_a$ in $W_{a,\mathbb{C}}$. Let $B_{\varepsilon_0}(r)$ denote the complex $r$-neighborhood of $\varepsilon_0$ in $\mathbb{C}^k$.

Following the proof of Theorem 2.1 (especially pages 5254-5255 of [5]), one obtains a lower
bound $r > 0$ on the domain of analyticity if there exist $R > 0$ and $0 < \rho < 1$ satisfying the
following conditions:

1. For any $a, z \in \mathcal{A}$ and any $\varepsilon \in B_{\varepsilon_0}(r)$, $f_z^\varepsilon$ is a contraction, with respect to the complex
Hilbert metric, on $\Omega_{a,H}(R)$:

$$\sup_{x \neq y \in \Omega_{a,H}(R)} \frac{d_{z,H}(f_z^\varepsilon(x), f_z^\varepsilon(y))}{d_{a,H}(x, y)} \leq \rho < 1.$$

2. For any $\varepsilon \in B_{\varepsilon_0}(r)$, any $x \in \cup_a W_a$ and any $z \in \mathcal{A}$,

$$d_{z,H}(f_z^\varepsilon(x), f_z^{\varepsilon_0}(x)) \leq R(1 - \rho),$$

and

$$d_{z,H}(f_z^\varepsilon(\pi(\varepsilon)), f_z^{\varepsilon_0}(\pi(\varepsilon_0))) \leq R(1 - \rho)$$

(where $\pi(\varepsilon)$ denotes the stationary vector for the Markov chain defined by $\Delta^\varepsilon$).

3. For any $x \in \Omega_{a,H}(R)$ and $\varepsilon \in B_{\varepsilon_0}(r)$,

$$\sum_a |r_a^\varepsilon(x)| \leq 1/\rho.$$

The existence of $r, R, \rho$ follows from Theorem 2.1. In fact, we can choose $\rho$ to be any
positive number such that $\max_{a \in \mathcal{A}} \tau(\Delta_a) < \rho < 1$, and small $r, R$ to satisfy condition 1, then
smaller $r, R$, if necessary, to further satisfy conditions 2 and 3.

Let $\Omega_{a,E}(R)$ denote the neighborhood of diameter $R$, measured in the Euclidean metric,
of $W_a$ in $W_{a,\mathbb{C}}$. To facilitate the computation, at the expense of obtaining a smaller lower
bound, it may be easier to use $\Omega_{a,E}(R)$ instead of $\Omega_{a,H}(R)$; then, the conditions above are
replaced with the following conditions:

(1') Condition 1 above with $\Omega_{a,H}(R)$ replaced by $\Omega_{a,E}(R)$ (the map $f_z^\varepsilon$ is still required to
be a contraction under the complex Hilbert metric).

(2') Condition 2 above with $R$ on the right hand side of the inequalities replaced by $R/K$,
where

$$K = \sup_{x \neq y \in \Omega_{a,E}(R),\ a} \frac{d_{a,E}(x, y)}{d_{a,H}(x, y)};$$

note that for $R$ sufficiently small, $0 < K < \infty$
since $d_{a,H}$ and $d_{a,E}$ are equivalent metrics (this in turn follows from the fact that the
Euclidean metric and (real) Hilbert metric are equivalent on any compact subset of
the interior of the real simplex).

(3') Condition 3 above with $\Omega_{a,H}(R)$ replaced by $\Omega_{a,E}(R)$.



2.2 Example for Domain of Analyticity 



In the following, we consider hidden Markov processes obtained by passing binary Markov
chains through binary symmetric channels with crossover probability $\varepsilon$. Suppose that the
Markov chain is defined by a $2 \times 2$ stochastic matrix $\Pi = [\pi_{ij}]$. From now through the end
of this section, we assume:

• $\det(\Pi) > 0$ - and -

• all $\pi_{ij} > 0$ - and -

• $0 < \varepsilon < 1/2$.

We remark that the condition $\det(\Pi) > 0$ is purely for convenience.

Strictly speaking, the underlying Markov process of the resulting hidden Markov process
is given by a 4-state matrix (the states are the ordered pairs of a state of $\Pi$ and a noise state
(0 for "noise off" and 1 for "noise on"); see page 5255 of [5]). However, the information
contained in each $f_a$ can be reduced to an equivalent map induced by a $2 \times 2$ matrix and
then reduced to an equivalent function of a single variable as in Example 1.5. We describe
this as follows.

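Let $a_i = p(z_1^i, y_i = 0)$ and $b_i = p(z_1^i, y_i = 1)$. The pair $(a_i, b_i)$ satisfies the following
dynamical system:

$$(a_i, b_i) = (a_{i-1}, b_{i-1}) \begin{bmatrix} p_\varepsilon(\bar{z}_i)\pi_{00} & p_\varepsilon(z_i)\pi_{01} \\ p_\varepsilon(\bar{z}_i)\pi_{10} & p_\varepsilon(z_i)\pi_{11} \end{bmatrix},$$

where $p_\varepsilon(0) = \varepsilon$ and $p_\varepsilon(1) = 1 - \varepsilon$, and $\bar{z} = 1 - z$ denotes the complement of the binary symbol $z$.

Similar to Example 1.5, letting $x_i = a_i/b_i$, we have a dynamical system with just one variable:

$$x_{i+1} = f^\varepsilon_{z_{i+1}}(x_i),$$

where

$$f_z^\varepsilon(x) = \frac{p_\varepsilon(\bar{z})}{p_\varepsilon(z)} \cdot \frac{\pi_{00} x + \pi_{10}}{\pi_{01} x + \pi_{11}}, \qquad z = 0, 1,$$

starting with

$$x_0 = \pi_{10}/\pi_{01}, \qquad (18)$$

which comes from the stationary vector of $\Pi$.
It can be shown that

$$p^\varepsilon(z_i = 0 \mid z_1^{i-1}) = r_0^\varepsilon(x_{i-1}), \qquad p^\varepsilon(z_i = 1 \mid z_1^{i-1}) = r_1^\varepsilon(x_{i-1}),$$

where

$$r_0^\varepsilon(x) = \frac{((1 - \varepsilon)\pi_{00} + \varepsilon\pi_{01})x + ((1 - \varepsilon)\pi_{10} + \varepsilon\pi_{11})}{x + 1} \qquad (19)$$

and

$$r_1^\varepsilon(x) = \frac{(\varepsilon\pi_{00} + (1 - \varepsilon)\pi_{01})x + (\varepsilon\pi_{10} + (1 - \varepsilon)\pi_{11})}{x + 1}. \qquad (20)$$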
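The one-variable recursion can be checked against the two-variable forward recursion on $(a_i, b_i)$. The following is a minimal sketch, assuming the convention implied by (19) and (20) that the channel outputs $z = y$ with probability $1 - \varepsilon$; the matrix, the value of $\varepsilon$, the test sequence and the function names are illustrative assumptions.

```python
import math

def f_eps(z, x, Pi, eps):
    """One-variable map: x_{i+1} = f_z(x_i) for the ratio x = a/b."""
    factor = (1 - eps) / eps if z == 0 else eps / (1 - eps)
    return factor * (Pi[0][0] * x + Pi[1][0]) / (Pi[0][1] * x + Pi[1][1])

def r_eps(z, x, Pi, eps):
    """Conditional probability of the next symbol, as in (19) and (20)."""
    p_y0 = (Pi[0][0] * x + Pi[1][0]) / (x + 1)
    p_y1 = (Pi[0][1] * x + Pi[1][1]) / (x + 1)
    return (1 - eps) * p_y0 + eps * p_y1 if z == 0 else eps * p_y0 + (1 - eps) * p_y1

def log_prob_via_ratio(zs, Pi, eps):
    """log p(z_1, ..., z_n) accumulated along the one-variable recursion."""
    x = Pi[1][0] / Pi[0][1]  # starting value from the stationary vector of Pi
    total = 0.0
    for z in zs:
        total += math.log(r_eps(z, x, Pi, eps))
        x = f_eps(z, x, Pi, eps)
    return total

def log_prob_forward(zs, Pi, eps):
    """The same quantity from the two-variable recursion on (a_i, b_i)."""
    s = Pi[1][0] + Pi[0][1]
    a, b = Pi[1][0] / s, Pi[0][1] / s
    for z in zs:
        e0 = 1 - eps if z == 0 else eps  # p(z | y = 0)
        e1 = eps if z == 0 else 1 - eps  # p(z | y = 1)
        a, b = e0 * (a * Pi[0][0] + b * Pi[1][0]), e1 * (a * Pi[0][1] + b * Pi[1][1])
    return math.log(a + b)

Pi = [[0.8, 0.2], [0.3, 0.7]]  # illustrative stochastic matrix with det > 0
zs = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
lp_ratio = log_prob_via_ratio(zs, Pi, 0.1)
lp_forward = log_prob_forward(zs, Pi, 0.1)
print(lp_ratio, lp_forward, -lp_ratio / len(zs))
```

The last printed value, $-\frac{1}{n}\log p(z_1^n)$, is the usual finite-sample estimate of the entropy rate when `zs` is a long sample path.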



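Now let $\Omega(R)$ denote the complex $R$-neighborhood (in Euclidean metric) of the interval

$$S = [S_1, S_2] = \left[ \frac{\varepsilon_0 \pi_{10}}{(1 - \varepsilon_0)\pi_{11}},\ \frac{(1 - \varepsilon_0)\pi_{00}}{\varepsilon_0 \pi_{01}} \right];$$

this interval is the union of $f_0^{\varepsilon_0}([0, \infty])$ and $f_1^{\varepsilon_0}([0, \infty])$; again let $B_{\varepsilon_0}(r)$ denote the complex
$r$-neighborhood of a given cross-over probability $\varepsilon_0 > 0$.

The sufficient conditions (1'), (2') and (3') in Section 2.1 are guaranteed by the following:
there exist $R > 0$, $r > 0$, $0 < \rho < 1$ such that

(1'') For any $z$, $f_z^\varepsilon(x)$ is a contraction on $\Omega(R)$ under the complex Hilbert metric,

$$\sup_{x \neq y \in \Omega(R)} \left| \frac{\log f_z^\varepsilon(x) - \log f_z^\varepsilon(y)}{\log x - \log y} \right| \leq \rho < 1.$$

Note that here

$$\log f_z^\varepsilon(x) - \log f_z^\varepsilon(y) = \log \frac{\pi_{00} x + \pi_{10}}{\pi_{01} x + \pi_{11}} - \log \frac{\pi_{00} y + \pi_{10}}{\pi_{01} y + \pi_{11}}.$$

(2'') For any $\varepsilon \in B_{\varepsilon_0}(r)$, any $x \in S$ and any $z$,

$$|\log f_z^\varepsilon(x) - \log f_z^{\varepsilon_0}(x)| \leq (R/K)(1 - \rho),$$

where

$$K = \sup_{x \neq y \in \Omega(R)} \left| \frac{x - y}{\log x - \log y} \right| = \sup_{x \in \Omega(R)} |x| = S_2 + R$$

(note that here the second condition in (2') is vacuous since by (18) $x_0$ does not depend
on $\varepsilon$).

(3'') For any $x \in \Omega(R)$ and $\varepsilon \in B_{\varepsilon_0}(r)$,

$$|r_0^\varepsilon(x)| + |r_1^\varepsilon(x)| \leq 1/\rho.$$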

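By considering extreme cases, the above conditions can be further relaxed to:

(1''')

$$\frac{\det(\Pi)}{\pi_{01}\pi_{00}(S_1 - R) + \pi_{01}\pi_{10} + \pi_{11}\pi_{00} + \pi_{11}\pi_{10}/(S_2 + R)} \leq \rho$$

(here we applied the mean value theorem to give an upper bound on $|\log((\pi_{00} x + \pi_{10})/(\pi_{01} x + \pi_{11})) - \log((\pi_{00} y + \pi_{10})/(\pi_{01} y + \pi_{11}))|$).

(2''')

$$\frac{r}{\varepsilon_0 - r} + \frac{r}{1 - \varepsilon_0 - r} \leq (R/(S_2 + R))(1 - \rho)$$

(here we applied the mean value theorem to give an upper bound on $|\log((1 - \varepsilon)/\varepsilon) - \log((1 - \varepsilon_0)/\varepsilon_0)|$).

(3''')

$$\frac{((1 - \varepsilon_0 + r)\pi_{00} + (\varepsilon_0 + r)\pi_{01})(S_2 + R) + ((1 - \varepsilon_0 + r)\pi_{10} + (\varepsilon_0 + r)\pi_{11})}{S_1 - R + 1} + \frac{((\varepsilon_0 + r)\pi_{00} + (1 - \varepsilon_0 + r)\pi_{01})(S_2 + R) + ((\varepsilon_0 + r)\pi_{10} + (1 - \varepsilon_0 + r)\pi_{11})}{S_1 - R + 1} \leq 1/\rho.$$

In other words, choose $r$, $R$ and $\rho$ to satisfy the conditions (1'''), (2''') and (3'''). Then
the entropy rate is an analytic function of $\varepsilon$ on $|\varepsilon - \varepsilon_0| < r$.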

Consider the symmetric case: $\pi_{00} = \pi_{11} = p$ and $\pi_{01} = \pi_{10} = 1 - p$. We plot lower bounds
on the radius of convergence of $H(Z)$ (as a function of $\varepsilon$ at $\varepsilon_0 = 0.4$) against $p$ in Figure 1. For
a fixed $p$, the lower bound is obtained by randomly generating many 3-tuples $(r, R, \rho)$ and
taking the maximal $r$ from the 3-tuples which satisfy the conditions.

Acknowledgements: We thank Albert Chau for helpful discussions on Riemannian
metrics in H.